
28 research outputs found

LIG-CRIStAL System for the WMT17 Automatic Post-Editing Task

Author: Berard Alexandre
Besacier Laurent
Pietquin Olivier
Publication date: 17/07/2017

This paper presents the LIG-CRIStAL submission to the shared Automatic Post-Editing task of WMT 2017. We propose two neural post-editing models: a monosource model with a task-specific attention mechanism, which performs particularly well in a low-resource scenario; and a chained architecture which makes use of the source sentence to provide extra context. This latter architecture manages to slightly improve our results when more training data is available. We present and discuss our results on two datasets (en-de and de-en) that are made available for the task. Comment: keywords: neural post-edition, attention model

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Memory-efficient NLLB-200: Language-specific Expert Pruning of a Massively Multilingual Machine Translation Model

Author: Berard Alexandre
Koishekenov Yeskendir
Nikoulina Vassilina
Publication date: 07/07/2023

The recently released NLLB-200 is a set of multilingual Neural Machine Translation models that cover 202 languages. The largest model is based on a Mixture of Experts architecture and achieves SoTA results across many language pairs. It contains 54.5B parameters and requires at least four 32GB GPUs just for inference. In this work, we propose a pruning method that enables the removal of up to 80% of experts without further finetuning and with a negligible loss in translation quality, which makes it feasible to run the model on a single 32GB GPU. Further analysis suggests that our pruning metrics can identify language-specific experts

arXiv.org e-Print Archive

Transferts de champs entre maillages de type éléments finis et applications numériques en mécanique non linéaire des structures

Author: Berard Alexandre
Publication venue: HAL CCSD
Publication date: 16/09/2011

In continuum mechanics, solving a problem with the finite element method yields fields discretized at the nodes or at the integration (Gauss) points of a given mesh of the structure under study. If we wish to use these results to perform a calculation on a second mesh, a data transfer is inevitable, especially in chained studies, during mesh adaptation, or when coupling several codes. Numerical simulation must take this fact into account, which is not entirely the case today. The R&D division of EDF therefore wants tools that remove this obstacle within the free software Code_Aster. This manuscript summarizes the work done during the thesis, which had the following objectives: propose methods for field transfer; compare and assess these approaches through theoretical and numerical error analyses; implement one of these methods in Code_Aster; and validate this implementation on some industrial cases.

HAL-uB

Thèses en Ligne

HAL - Université de Franche-Comté

HAL-CEA

Naver Labs Europe's Participation in the Robustness, Chat, and Biomedical Tasks at WMT 2020

Author: Berard Alexandre
Calapodescu Ioan
Nikoulina Vassilina
Philip Jerin
Publication date: 19/11/2020

Edinburgh Research Explorer

NAVER LABS Europe's Multilingual Speech Translation Systems for the IWSLT 2023 Low-Resource Track

Author: Berard Alexandre
Boito Marcely Zanon
Calapodescu Ioan
Gow-Smith Edward
Publication date: 13/06/2023

This paper presents NAVER LABS Europe's systems for Tamasheq-French and Quechua-Spanish speech translation in the IWSLT 2023 Low-Resource track. Our work attempts to maximize translation quality in low-resource settings using multilingual parameter-efficient solutions that leverage strong pre-trained models. Our primary submission for Tamasheq outperforms the previous state of the art by 7.5 BLEU points on the IWSLT 2022 test set, and achieves 23.6 BLEU on this year's test set, outperforming the second best participant by 7.7 points. For Quechua, we also rank first and achieve 17.7 BLEU, despite having only two hours of translation data. Finally, we show that our proposed multilingual architecture is also competitive for high-resource languages, outperforming the best unconstrained submission to the IWSLT 2021 Multilingual track, despite using much less training data and compute. Comment: IWSLT 2023: Tamasheq-French and Quechua-Spanish challenge winner

arXiv.org e-Print Archive

SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages

Author: Berard Alexandre
Besacier Laurent
Brun Caroline
Henderson James
Mohammadshahi Alireza
Nikoulina Vassilina
Publication date: 20/10/2022

In recent years, multilingual machine translation models have achieved promising performance on low-resource language pairs by sharing information between similar languages, thus enabling zero-shot translation. To overcome the "curse of multilinguality", these models often opt for scaling up the number of parameters, which makes their use in resource-constrained environments challenging. We introduce SMaLL-100, a distilled version of the M2M-100 (12B) model, a massively multilingual machine translation model covering 100 languages. We train SMaLL-100 with uniform sampling across all language pairs and therefore focus on preserving the performance of low-resource languages. We evaluate SMaLL-100 on different low-resource benchmarks (FLORES-101, Tatoeba, and TICO-19) and demonstrate that it outperforms previous massively multilingual models of comparable sizes (200-600M) while improving inference latency and memory usage. Additionally, our model achieves comparable results to M2M-100 (1.2B), while being 3.6x smaller and 4.3x faster at inference. Code and pre-trained models: https://github.com/alirezamshi/small100 Comment: Accepted to EMNLP 2022

arXiv.org e-Print Archive

Field transfers between finite element meshes and numerical applications in non linear mechanics

Author: Berard Alexandre
Publication date: 19/09/2011

In continuum mechanics, solving a problem with the finite element method yields fields discretized at the nodes or at the integration (Gauss) points of a given mesh of the structure under study. If we wish to use these results to perform a calculation on a second mesh, a data transfer is inevitable, especially in chained studies, during mesh adaptation, or when coupling several codes. Numerical simulation must take this fact into account, which is not entirely the case today. The R&D division of EDF therefore wants tools that remove this obstacle within the free software Code_Aster. This manuscript summarizes the work done during the thesis, which had the following objectives: propose methods for field transfer; compare and assess these approaches through theoretical and numerical error analyses; implement one of these methods in Code_Aster; and validate this implementation on some industrial cases.

Theses.fr

Monolingual Adapters for Zero-Shot Neural Machine Translation

Author: Berard Alexandre
Besacier Laurent
Gallé Matthias
Philip Jerin
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 16/11/2020

International audience. We propose a novel adapter layer formalism for adapting multilingual models. These adapter layers are more parameter-efficient than existing ones while obtaining as good or better performance. Each layer is specific to one language (as opposed to bilingual adapters), which allows the layers to be composed and to generalize to unseen language pairs. In this zero-shot setting, they obtain a median improvement of +2.77 BLEU points over a strong 20-language multilingual Transformer baseline trained on TED talks.

Hal - Université Grenoble Alpes

Edinburgh Research Explorer

Transfert de champs entre maillages de type éléments finis et applications numériques en mécanique non linéaire des structures

Author: Berard Alexandre
Cano Valerie
Hild Patrick
Meunier Sebastien
Publication venue: HAL CCSD
Publication date: 09/05/2011

National audience. See http://hal.archives-ouvertes.fr/docs/00/59/27/79/ANNEX/r_EYT028KT.pd

HAL-uB

HAL - Université de Franche-Comté

HAL-CEA

ON THE EARTHQUAKE ACTIVITY IN THE DEEPER ZONE OF SAKURAZIMA

Author: Berard Alexandre
Besacier Laurent
Boito Marcely Zanon
Villavicencio Aline
Publication venue: Disaster Prevention Research Institute, Kyoto University
Publication date: 01/03/1969

Since the early hours of the 29th of May, 1968, a great many felt earthquakes have occurred, so precise seismometric observations using the data recorder and other instruments were carried out. The results of the investigation on these earthquakes can be summarized as follows: 1) Supposing that the underground structure is homogeneous, with an average P-wave velocity of 2 km/sec or 3 km/sec, the epicenters of these earthquakes are estimated to be distributed from the center to the eastern part of Sakurajima, at depths of 2-15 km. 2) The push-pull distribution of the P wave of these earthquakes does not show any regularity. 3) The coefficient m of the Ishimoto-Iida empirical formula is 1.8, which is equivalent to that of the deeper-zone earthquakes in the volcano. 4) Since the occurrence of the earthquake swarm, the surface phenomena of the volcano have not shown any conspicuous change, and few shallow-zone earthquakes have occurred near the crater.

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

HAL Descartes

Kyoto University Research Information Repository

White Rose Research Online

Hal-Diderot